Finding and evaluating sets of nearest neighbours

نویسندگان

  • Julie Weeds
  • David Weir
چکیده

In this paper, we consider two applications of distributional similarity measures, probability estimation and prediction of semantic similarity. We investigate whether high performance in one application area is correlated with high performance in the other. This work also provides an evaluation of two state-of-the-art distributional similarity measures and introduces a variant of one. Further, we overcome statistical biases in the standard pseudo-disambiguation task and look at the effect of word and co-occurrence frequency on the performance of the measures.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Pseudo-Likelihood Inference Underestimates Model Uncertainty: Evidence from Bayesian Nearest Neighbours

When using the K-nearest neighbours (KNN) method, one often ignores the uncertainty in the choice of K. To account for such uncertainty, Bayesian KNN (BKNN) has been proposed and studied (Holmes and Adams 2002 Cucala et al. 2009). We present some evidence to show that the pseudo-likelihood approach for BKNN, even after being corrected by Cucala et al. (2009), still significantly underest...

متن کامل

Incorporating Farthest Neighbours in Instance Space Classification

The nearest neighbour (NN) classifier is often known as a ‘lazy’ approach but it is still widely used particularly in the systems that require pattern matching. Many algorithms have been developed based on NN in an attempt to improve classification accuracy and to reduce the time taken, especially in large data sets. This paper proposes a new classification technique based on kNearest Neighbour...

متن کامل

SNN: A Supervised Clustering Algorithm

In this paper, we present a new algorithm based on the nearest neighbours method, for discovering groups and identifying interesting distributions in the underlying data in the labelled databases. We introduces the theory of nearest neighbours sets in order to base the algorithm S-NN (Similar Nearest Neighbours). Traditional clustering algorithms are very sensitive to the user-defined parameter...

متن کامل

A K-nearest neighbours method based on imprecise probabilities

K-nearest neighbours algorithms are among the most popular existing classification methods, due to their simplicity and their good performances. Over the years, several extensions of the initial method have been proposed. In this paper, we propose a K-nearest neighbours approach that uses the theory of imprecise probabilities, and more specifically lower previsions. We show that the proposed ap...

متن کامل

Concave hull: A k-nearest neighbours approach for the computation of the region occupied by a set of points

This paper describes an algorithm to compute the envelope of a set of points in a plane, which generates convex or non-convex hulls that represent the area occupied by the given points. The proposed algorithm is based on a k-nearest neighbours approach, where the value of k, the only algorithm parameter, is used to control the “smoothness” of the final solution. The obtained results show that t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003